Abstract

Asking whether parameters can be identified from one or several datasets is an important step towards sound model development. Most mechanistic predator-prey modelling has attempted either parameterization from process rate data or inverse modelling (identifying parameters solely from time series). Here, we take a middle road: we aim at identifying the potential benefits of combining datasets, when both population growth and predation processes are viewed as stochastic. We fit a discrete-time, stochastic predator-prey model of the Leslie type to time series and functional response data simulated from the model. Our model has both environmental stochasticity in the growth rates and interaction stochasticity, i.e., a stochastic functional response. We examine what the functional response data brings to the quality of the estimates, and whether estimation would be possible (for various time series lengths) solely with time series data. Both Bayesian and frequentist estimation are performed, and in both cases we report diagnostics of identifiability of the various parameters. Our results suggest that if the functional response is indeed only slightly stochastic, identifying parameters in fact requires functional response data as a complement to time series of counts [more stuff on results here]. Our framework to combine data sets is general and may be extended to other interaction types for which data on both interaction rates and population counts are available.

Introduction

tentative list

Models and statistical methods

Predator-prey model in discrete time

We chose a model with a numerical response of the Leslie type (though similar analyses can be done for Rosenzweig-MacArthur models as well, see Supplement S1). A Beverton-Holt type density-dependence for the prey was chosen to avoid cycles in the prey in absence of the predator, so that the model behaviour is more reminiscent of its continuous-time counterpart (see Weide et al. for more on connecting continuous to discrete time models). Our model reads

\[ \begin{aligned} N_{t+1} & = N_{t}\frac{e^{r+\epsilon_{1t}}}{1+\gamma N_{t}}\exp\left(-G(N_{t})\frac{P_{t}}{N_{t}}\right),\,\epsilon_{1t}\sim\mathcal{N}(0,\sigma_{1}^{2})\label{eq:prey_discreteLeslieMay} \end{aligned} \]

\[ \begin{aligned} P_{t+1} & = P_{t}\frac{e^{s+\epsilon_{2t}}}{1+qP_{t}/N_{t}},\,\epsilon_{2t}\sim\mathcal{N}(0,\sigma_{2}^{2})\label{eq:predator_discreteLeslieMay} \end{aligned} \]

The roots of this discrete-time formulation can be traced back to Leslie (1948) and Leslie & Gower (1960), who included a Beverton-Holt regulation for the prey and predator to make discrete-time models more similar to their continuous-time counterparts. We additionally consider a saturating functional response \(G(N_t)\) here, which makes our model resemble the continuous-time models of Tanner (1975) or May (1973) and later Turchin & Hanski (1997), whose notations we have kept. One advantage of this model over the Rosenzweig-MacArthur is that it can be parameterized using the observed intrinsic growth rates, as Tanner (1975) did. Parameter values were inspired by small mammals (Turchin & Hanski 1997), which we modified slightly [these are not the same models anyway]. The division by \(N_t\) in \(\exp(-G(N_t)P_t/N_t)\) expresses the fact that all terms within the exponential are on the prey fitness scale (per capita mortality). This is similar to assuming a Nicholson-Bailey type predation term (e.g., Weide et al. 2018).
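To make the update rule concrete, here is a minimal R sketch of one deterministic time step of the model (noise terms set to zero; the function name `leslie_step` and the Holling type II form for \(G(N)\) are our illustrative choices):

```r
# One deterministic time step of the Leslie-type model (noise set to zero).
# A sketch: the function name and argument names are ours, not from the text.
leslie_step <- function(N, P, r, gamma, s, q, C, D) {
  G <- C * N / (D + N)                                      # Holling type II kill rate
  N_next <- N * exp(r) / (1 + gamma * N) * exp(-G * P / N)  # prey equation
  P_next <- P * exp(s) / (1 + q * P / N)                    # predator (Leslie-Gower) equation
  c(N = N_next, P = P_next)
}

# Example with the quasi-cycle parameter set (r = 2, s = 0.5, gamma = 1/K = 1,
# q = 10, C = 2.5, D = 1):
leslie_step(N = 1, P = 0.1, r = 2, gamma = 1, s = 0.5, q = 10, C = 2.5, D = 1)
```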

Until now, we have not specified a model for the functional response \(G(N_{t})\). With a deterministic functional response (FR), we have a classic stochastic predator-prey model with log-normal environmental noise but an otherwise 'deterministic skeleton'. A stability analysis of the deterministic model is performed in Appendix A1; it is later used to determine which parameters lead to quasi-cycles or noisy limit cycles in the stochastic model.

However, here we consider a data-generating process where the functional response is not deterministic but itself stochastic, as in the following equation for a Holling type II functional response:

\[ \begin{equation}\label{eq:FR} G_{t}|N_t\sim\mathcal{N}\left(\frac{CN_{t}}{D+N_{t}},\sigma_{3}^2\right) \end{equation} \]

This corresponds to a case where there are mild Gaussian fluctuations around the functional response. Because there can be substantial noise on the FR (see e.g. plots in Fig. 5 of Rosenbaum & Rall 2018), we also consider more complicated cases where the parameters \(C\) and \(D\) are themselves allowed to vary in time in Supplement S2 [I was thinking to only do this in a Bayesian setting for the estimation part, since these models need more constraining].

General statistical methodology

Although we apply data integration here to a predator-prey case, the methodology is more general and can in principle be applied to any addition of auxiliary information (interaction rate, demographic rate) which is available over time and not simply produced by the count data. The approach has similarities to Integrated Population Modelling (Besbeas et al. 2002; Maunder 2004; Péron & Koons 2012) and data fusion approaches in ecosystem science (Niu et al. 2014). In a predator-prey context, we can cite the recent endeavour of Ferguson et al. (2018) to combine count and isotopic (diet) data, but their model differs in that it does not view the interaction rate as the result of a stochastic process.

We illustrate the method with the predator-prey case where log-densities for both predator and prey are gathered in a log-count vector \(\mathbf{x}_{t}=(\ln(N_{t}),\ln(P_{t}))^{T}\). Auxiliary information on the functional response, or rather the observed kill rate per predator, is denoted \(G_t\). To this can be added other demographic (vital) rates \(R_{t}\), stacked in a vector as well, \(\mathbf{a}_{t}=(G_{t}, R_{t})\). Currently we use \(\mathbf{a}_{t}=G_{t}\), but it may be useful to add more information in other applications; hence the derivation is kept general.

We consider a discrete-time dynamical system (nonlinear difference equation). The population dynamics part of the model gives us the probability law of \(\mathbf{x}_{t+1}|(\mathbf{x}_{t},\mathbf{a}_{t})\), since the counts at time \(t+1\) depend both on past abundances and the interaction or demographic data based on the chosen mechanistic model. We also know the probability law of \(\mathbf{a}_{t}|\mathbf{x}_{t}\) (in our simple predator-prey case, the functional response). We can therefore write down easily the joint likelihood for both data sources in quite general terms, denoting \(\mathbf{X}=(\mathbf{x}_{1},...,\mathbf{x}_{t_{m}})\) and \(\mathbf{A}=(\mathbf{a}_{1},...,\mathbf{a}_{t_{m}})\):

\[ \begin{equation} \mathcal{L}(\mathbf{X},\mathbf{A})=p(\mathbf{x}_{1},\mathbf{a}_{1})\prod_{t=1}^{t_{m}-1}p_{C}(\mathbf{x}_{t+1},\mathbf{a}_{t+1}|\mathbf{x}_{t},\mathbf{a}_{t}) \end{equation} \]

where \(p(y)\) and \(p_{C}(y)\) are continuous probability densities for the vector \((\mathbf{x},\mathbf{a})\) and its conditional pdf, respectively. The conditional pdf can be further decomposed using the chain rule

\[ p(\mathbf{x}_{t+1},\mathbf{a}_{t+1}|\mathbf{x}_{t},\mathbf{a}_{t})=p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1},\mathbf{x}_{t},\mathbf{a}_{t})\times p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t})=p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1})\times p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t}) \]

where \(p_{1}(y)\) is given by the dynamical system and \(p_{2}(y)\) is the functional response model (or a demographic model). We therefore end up with a model

\[ \begin{equation}\label{full-likelihood} \mathcal{L}(\mathbf{X},\mathbf{A})=p_{2}(\mathbf{a}_{1}|\mathbf{x}_{1})p(\mathbf{x}_{1})\prod_{t=1}^{t_{m}-1}\underbrace{p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t})}_{\text{dynamical system}}\times\underbrace{p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1})}_{\text{auxiliary information model}} \end{equation} \]

where we swapped \(p_{1}\) and \(p_{2}\) to get the dynamical system model first.

Application to the stochastic predator-prey model

In the simplest case highlighted by our discrete-time dynamical system of the sections above, \(p_{1}(x|y)=p_{11}(x_{1}|y)p_{12}(x_{2}|y)\) is the product of the two Gaussian pdfs for log-densities conditional on past densities and auxiliary information. Using the equations \eqref{eq:prey_discreteLeslieMay} and \eqref{eq:predator_discreteLeslieMay}, we obtain the following difference equations on the log-scale:

\[ \begin{equation} n_{t+1}|\mathbf{x}_{t}=\ln(N_{t+1})|\mathbf{x}_{t}\sim\mathcal{N}(\mu_{1t},\sigma_{1}^{2})\,\,,\mu_{1t}=n_{t}+r-G_{t}\frac{P_{t}}{N_{t}}-\ln(1+\gamma N_{t}) \end{equation} \] and \[ \begin{equation} p_{t+1}|\mathbf{x}_{t}=\ln(P_{t+1})|\mathbf{x}_{t}\sim\mathcal{N}(\mu_{2t},\sigma_{2}^{2})\,\,,\mu_{2t}=p_{t}+s-\ln(1+qP_{t}/N_{t}) \end{equation} \] with a functional response model (\(p_{2}\)) also given by a Gaussian pdf (this is the simplest case where we assume near Gaussian noise on the FR, see Discussion)

\[ G_{t}|N_t \sim\mathcal{N}(\mu_{3t},\sigma_{3}^{2}),\;\mu_{3t}=\frac{CN_{t}}{D+N_{t}} \]
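Putting the three Gaussian pdfs together, each time step's contribution to the joint log-likelihood of eq. \eqref{full-likelihood} can be sketched in R as follows (a minimal illustration; the function and argument names are ours):

```r
# Sketch: one time step's contribution to the joint log-likelihood, using the
# three Gaussian conditionals above. Arguments n_t, p_t, n_t1, p_t1 are
# log-counts at times t and t+1; G_t, G_t1 are the observed kill rates.
step_loglik <- function(n_t, p_t, G_t, n_t1, p_t1, G_t1,
                        r, gamma, s, q, C, D, sigma1, sigma2, sigma3) {
  N_t  <- exp(n_t); P_t <- exp(p_t); N_t1 <- exp(n_t1)
  mu1 <- n_t + r - G_t * P_t / N_t - log(1 + gamma * N_t)  # prey mean (p_1)
  mu2 <- p_t + s - log(1 + q * P_t / N_t)                  # predator mean (p_1)
  mu3 <- C * N_t1 / (D + N_t1)                             # FR model mean (p_2)
  dnorm(n_t1, mu1, sigma1, log = TRUE) +
    dnorm(p_t1, mu2, sigma2, log = TRUE) +
    dnorm(G_t1, mu3, sigma3, log = TRUE)
}
```

Summing these contributions over \(t = 1, \ldots, t_m - 1\) (plus the initial term) gives the full joint log-likelihood.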

Model scenarios

We considered the following parameter sets for the model [illustration needed, 4 panels: (N(t), P(t), F(N), N vs P) for each parameter set, illustrate the first one in the MS, second one in Appendix?]

| Parameter | Quasi-cycles | Noisy LC |
|-----------|--------------|----------|
| C         | 2.5          | 15.0     |
| D         | 1.0          | 0.25     |
| Q         | 10           | 10       |

The rest of the parameters are \(r=2,s=0.5,K=1,\sigma_1=\sigma_2=\sigma_3=0.05\).

The first parameter set corresponds to a forced focus, or quasi-cycles, i.e., sustained oscillations that arise from the interaction between noise and damped oscillations towards the fixed point in the deterministic model. The second corresponds to a noisy limit cycle, i.e., parameters that give rise to a limit cycle without the noise, so that the cycle is still very regular but perturbed by the noise.

These two sets of parameters are crossed with several modalities of data availability:

- we consider time series lengths T = 1000 and T = 100 [should I add T = 50 and 25?]
- we consider that functional response data are available over 100%, 25%, or 0% of the time series.

This is meant to emulate common scenarios in which kill rates are not monitored over the whole time series, and to quantify the benefits of adding even a little FR data.

Note that in the case without FR data, we fit the model without noise in the functional response [this is something we need to discuss in depth Olivier], which is what is usually done in this case (Turchin & Ellner 2000; newer stuff?).

For each parameter set × data availability scenario, we fit the models in both Bayesian and frequentist settings.

Model fitting implementation

We fitted the model by MCMC in JAGS, and also derived its likelihood mathematically, which we then optimised using the BFGS algorithm [see comments below on Nelder-Mead as well] with optim() in R. Several starting values were considered.

Identifiability was inspected in different ways. In a frequentist setting, we computed the Hessian matrix of the negative log-likelihood, \(H(\theta,Y)\) with \(Y = (X,A)\), following eq. \eqref{full-likelihood}, at the estimated parameter set \(\hat{\theta}\). Formally, \(H(\hat{\theta},Y)\) is the observed Fisher Information Matrix (FIM). When the Hessian \(H(\hat{\theta},Y)\) has only non-zero eigenvalues (and is therefore invertible), the model is locally identifiable, in the sense that no two parameter sets can give the same log-likelihood value in a neighborhood of the optimum (Rothenberg 1971). We also computed the expected FIM at the true parameter value. This quantity is relevant theoretically because asymptotically, as the time series length \(T\) gets large, \(\hat{\theta} \sim \mathcal{N}(\theta_{\text{true}},\mathcal{I}_T(\theta_{\text{true}})^{-1})\), where \(\mathcal{I}_T(\theta_{\text{true}})\) is the FIM for a time series of length \(T\), obtained by taking expectations over many time series of length \(T\). As the time series length \(T \rightarrow \infty\), we most likely have the convergence \(H(\hat{\theta},y^T) \rightarrow \mathcal{I}_T(\theta_{\text{true}})\) [should I try to prove this properly btw? I think this should be doable by splitting the long time series into smaller chunks, and then proving that the combined log-likelihood is obtained by summing LL for all chunks. Perhaps we can use - among other things - that for a partition \((T_1,..., T_n)\) of the time series of total length \(T\) into chunks of size \(T/n\), we have \(\mathcal{I}_T(\theta_{\text{true}}) = n \mathcal{I}_{T/n}(\theta_{\text{true}})\)].

We also computed the expected pairwise correlations between the parameters: the variance-covariance matrix of the parameters is given by \(\Sigma = H^{-1}\), from which we easily obtain the expected pairwise parameter correlation matrix \((\rho_{ij})\), with each element defined as \(\rho_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\Sigma_{jj}}}\).
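As a self-contained illustration of this workflow (on a toy Gaussian likelihood rather than the predator-prey model, so the numbers are easy to check), one can obtain the Hessian from `optim()`, test its eigenvalues, and convert its inverse into a correlation matrix:

```r
# Toy example of the identifiability diagnostics: observed FIM from optim(),
# eigenvalue check, then expected pairwise parameter correlations.
# The data and model here are illustrative, not the predator-prey likelihood.
set.seed(1)
y <- rnorm(200, mean = 2, sd = 0.5)
nll <- function(theta) -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
fit <- optim(c(0, 0), nll, hessian = TRUE)    # Hessian of the neg. log-likelihood
ev <- eigen(fit$hessian, symmetric = TRUE)$values
identifiable <- all(ev > 1e-8)                # invertible H => local identifiability
Sigma <- solve(fit$hessian)                   # asymptotic var-cov of the estimates
rho <- cov2cor(Sigma)                         # expected pairwise correlation matrix
```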

In a Bayesian setting, we inspected the correlations in the Markov chains for pairs of parameters, which translate into pairwise posterior distributions of parameters. Parameters whose chains were too positively or negatively correlated were considered not separately identifiable.
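A minimal sketch of this check (on simulated stand-in draws rather than actual JAGS output; the 0.9 threshold is an illustrative choice):

```r
# Flag parameter pairs whose posterior draws are too strongly correlated.
# The draws below are simulated stand-ins for MCMC output, one column per parameter.
set.seed(2)
n <- 5000
C_draws <- rnorm(n)
D_draws <- 0.95 * C_draws + rnorm(n, 0, sqrt(1 - 0.95^2))  # strongly correlated pair
Q_draws <- rnorm(n)                                        # uncorrelated parameter
draws <- cbind(C = C_draws, D = D_draws, Q = Q_draws)
rho <- cor(draws)
suspect <- abs(rho) > 0.9 & upper.tri(rho)   # pairs considered non-separable
which(suspect, arr.ind = TRUE)
```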

Results

[so far this is very preliminary and we do not consider all the abovementioned modalities of the analysis. We focus for now on T=1000 and the comparison with/without FR data, in both Bayesian and frequentist settings]

Data simulation

Here, we describe how to simulate the stochastic model [may be ignored in the paper but useful at that stage].

rm(list=ls())

### Parameters for simulation model

n.years<-1000   # Number of years - 25 first, perhaps use 50 or 100 / worked very well with almost no process error on the real scale
N1<-1           # Initial pop size
P1<-0.1
K<-1            # threshold dd 
beta<-1         # density-dependence exponent
rmax_V<-2           # Max AVERAGE growth rate (thus not a true max...)
rmax_P<-0.5
sigma2.proc<-0.05       # worked well with 0.005 as well
# Process sigma on the log-scale, use the Peretti et al. value. 0.005


### FR and predator parameters
C<-2.5
D<-1
Q<-10

### Simulation of data
#set.seed(42) 
set.seed(41)

y<-N<-P<-FR<-numeric(n.years)
N[1]<-N1
P[1]<-P1
FR[1]<-C*N[1]/(D+N[1])

rV<-rnorm(n.years-1,rmax_V,sqrt(sigma2.proc))
rP<-rnorm(n.years-1,rmax_P,sqrt(sigma2.proc))
FRnoise<-rnorm(n.years,0,sqrt(sigma2.proc))

for (t in 1:(n.years-1)){

  N[t+1]<-N[t]*(exp(rV[t])/(1+(N[t]/K)^beta))*exp(-FR[t]*P[t]/N[t])
  P[t+1]<-P[t]*exp(rP[t])/(1+P[t]*Q/N[t])
  FR[t+1]<-(C*N[t+1]/(D+N[t+1])) + FRnoise[t+1] #after for update
}

## Plotting time series of abundances and FR
par(mfrow=c(2,2))
plot(1:n.years,N,type="b")
plot(1:n.years,P,type="b")
plot(N,FR)
plot(log(N),log(P))

Bayesian analysis

Below we reproduce the JAGS code that was used to fit the model to both predator-prey count data and FR data.

library("R2jags")      # Load R2jags package

# Bundle data
jags.data <- list(T=n.years,logN=log(N),logP=log(P),FR=FR)

sink("predprey.txt")
cat("
    model {
    
    # Priors and constraints
    logN[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
    logP[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale

    # Priors prey population dynamics
    r_V ~ dnorm(1,0.001) # below the truth, rather flat prior
    K_V ~ dunif(0.2,10)
    sigma_V ~ dunif(0.01,5) # rather vague 
    sigma2_V<-pow(sigma_V, 2)
    tau_V<-pow(sigma_V,-2)

 
     #Priors predator population dynamics
    Q ~ dgamma(0.1,0.1)
    r_P ~ dnorm(1,0.1)
    sigma_P ~ dunif(0.01,2) # rather vague 
    sigma2_P<-pow(sigma_P, 2)
    tau_P<-pow(sigma_P,-2)
    
    #Priors predation parameters 
    tau_FR ~ dgamma(.01,.01)
    C~dgamma(.01,.01) # uninformative priors OK for that one
    D~dgamma(.01,.01)
    # check model Leslie to see how she specified priors...

    # Likelihood
    # state process

    for (t in 1:(T-1)){        

    FRUpdate[t] <- C*N[t]/(D+N[t]) #functional response equation, including noise
    FR[t] ~  dnorm(FRUpdate[t],tau_FR) #small trick to use FR data

    logNupdate[t] <- logN[t] + r_V -log(1+N[t]/K_V) -FR[t]*exp(logP[t])/N[t]
    logN[t+1] ~ dnorm(logNupdate[t],tau_V)
    N[t]<-exp(logN[t])
    # for some reason, log(1+(exp(r_V)-1)*N[t]/K_V) was not working

    logP[t+1]~ dnorm(logPupdate[t],tau_P)
    logPupdate[t] <- logP[t] + r_P - log(1+exp(logP[t])*Q/exp(logN[t]) )  
    
    }
    
    }
    ",fill=TRUE)
sink()


# Initial values
inits <- function () {
  list(sigma_V=runif(1,0.1,2), sigma_P=runif(1,0.1,2), r_V=runif(1,0.1,2),r_P=runif(1,0.1,2), K_V=runif(1,0.2,8), Q=runif(1,0,5),tau_FR=runif(1,1,10),C=runif(1,10,100),D=runif(1,0.01,0.1))}

# Parameters monitored
#parameters<-c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","b","C","D","logNupdate","logPupdate","FR")
parameters<-c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","tau_FR","C","D")


# MCMC settings
nc <- 3 #number of chains
nb <- 14000 # “burn in”
#ni <- 14000# “number of iterations” # that's for a symmetric distrib...
ni<-34000
nt <- 10 # “thinning”

We do not evaluate this chunk here because it takes too long to run; we load stored results later on.

# run model
out <- jags(jags.data, inits, parameters, "predprey.txt", n.chains=nc, n.thin=nt, n.iter=ni, n.burnin=nb, working.directory = getwd())
print(out, dig = 2)

The model without the functional response is the following:

### Now try to fit a model without the FR data. 
sink("predprey_without_sepFR.txt")
cat("
    model {
    
    # Priors and constraints
    logN[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
    logP[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
    
    # Priors prey population dynamics
    r_V ~ dnorm(1,0.001) # below the truth, rather flat prior
    K_V ~ dunif(0.2,10)
    sigma_V ~ dunif(0.01,5) # rather vague 
    sigma2_V<-pow(sigma_V, 2)
    tau_V<-pow(sigma_V,-2)
    
    
    #Priors predator population dynamics
    Q ~ dgamma(0.1,0.1)
    r_P ~ dnorm(1,0.1)
    sigma_P ~ dunif(0.01,2) # rather vague 
    sigma2_P<-pow(sigma_P, 2)
    tau_P<-pow(sigma_P,-2)
    
    #Priors predation parameters 
    
    C~dgamma(.01,.01) # 
    D~dgamma(.01,.01)
    # check model Leslie to see how she specified priors...
    
    # Likelihood
    # state process
    
    for (t in 1:(T-1)){        

    
    logNupdate[t] <- logN[t] + r_V -log(1+N[t]/K_V) -C*exp(logP[t])/(D+N[t])
    logN[t+1] ~ dnorm(logNupdate[t],tau_V)
    N[t]<-exp(logN[t])
    
    # for some reason, log(1+(exp(r_V)-1)*N[t]/K_V) was not working
    logP[t+1]~ dnorm(logPupdate[t],tau_P)
    logPupdate[t] <- logP[t] + r_P - log(1+exp(logP[t])*Q/exp(logN[t]) )  
    
    }
    
    }
    ",fill=TRUE)

In this second model we assume that the FR is deterministic rather than stochastic. This is something that we might discuss.

Both models take fairly long to run for T=1000 (approx. 1 h), so we stored the results instead (not on GitHub, as the files are a bit heavy).

Now we analyse those results, starting with what the functional response looks like. In orange, a least-squares fit to the FR data; in blue, the fitted FR for the Bayesian model with stochastic functional response; in red, the fitted FR for the Bayesian model with deterministic functional response (hence indirectly inferred from the time series). This reveals a serious mismatch for the red curve, whilst the blue one is very good.

CEb<-out$BUGSoutput$mean$C
DEb<-out$BUGSoutput$mean$D

CEr<-out2$BUGSoutput$mean$C
DEr<-out2$BUGSoutput$mean$D

### Fit functional response
par(mfrow=c(1,1))

fr_fit<-nls(FR~CE*N/(DE+N),start=list(CE=1,DE=1))
CE<-coef(fr_fit)[1]
DE<-coef(fr_fit)[2]
plot(N,FR,ylim=c(0,max(FR,na.rm=T)))
lines(N,CE*N/(DE+N),col="orange") # Frequentist fit of just the FR data in orange
lines(N,CEb*N/(DEb+N),col="blue") # Bayesian fit TS data + FR data
lines(N,CEr*N/(DEr+N),col="red") # Bayesian fit only TS data

### plot densities
library(mcmcplots)

### here is a density plot for the stochastic FR model
denplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))

We now plot the posterior densities for the deterministic FR model

denplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))

Traceplots for the stochastic FR model, showing good convergence

### Trace plots
traplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))

Traceplots for the deterministic FR model, showing wild excursions in parameter space

traplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))

Let's now compare convergence diagnostics.

For the stochastic FR model

print(out,dig=2)
## Inference for Bugs model at "predprey.txt", fit using jags,
##  3 chains, each with 34000 iterations (first 14000 discarded), n.thin = 10
##  n.sims = 6000 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5% Rhat
## C           2.51    0.04    2.43    2.48    2.51    2.54    2.60 1.00
## D           1.01    0.10    0.82    0.94    1.01    1.08    1.22 1.00
## K_V         0.98    0.19    0.63    0.84    0.97    1.11    1.37 1.07
## Q          10.27    0.80    8.77    9.72   10.24   10.80   11.92 1.00
## r_P         0.50    0.03    0.44    0.48    0.50    0.53    0.57 1.00
## r_V         2.04    0.17    1.74    1.91    2.02    2.16    2.40 1.06
## sigma2_P    0.05    0.00    0.05    0.05    0.05    0.06    0.06 1.00
## sigma2_V    0.05    0.00    0.05    0.05    0.05    0.05    0.05 1.00
## tau_FR     20.71    0.93   18.92   20.07   20.69   21.34   22.57 1.00
## deviance -429.74    4.16 -435.87 -432.80 -430.41 -427.34 -419.93 1.00
##          n.eff
## C         6000
## D         6000
## K_V         43
## Q          810
## r_P        880
## r_V         45
## sigma2_P  3500
## sigma2_V  6000
## tau_FR    6000
## deviance   610
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 8.6 and DIC = -421.1
## DIC is an estimate of expected predictive error (lower deviance is better).

For the deterministic FR model

print(out2,dig=2)
## Inference for Bugs model at "predprey_without_sepFR.txt", fit using jags,
##  3 chains, each with 34000 iterations (first 14000 discarded), n.thin = 10
##  n.sims = 6000 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5% Rhat
## C           8.91   14.52    1.44    1.91    2.19    6.41   53.38 2.70
## D          15.83   33.25    0.00    0.00    0.00   10.62  119.90 2.86
## K_V         1.16    0.51    0.55    0.77    0.97    1.50    2.38 2.86
## Q          10.27    0.80    8.79    9.71   10.23   10.78   11.94 1.00
## r_P         0.50    0.03    0.45    0.48    0.50    0.53    0.57 1.00
## r_V         1.96    0.34    1.34    1.67    2.01    2.22    2.55 2.82
## sigma2_P    0.05    0.00    0.05    0.05    0.05    0.06    0.06 1.00
## sigma2_V    0.05    0.00    0.05    0.05    0.05    0.05    0.05 1.00
## deviance -231.95    4.06 -239.16 -234.67 -232.32 -229.63 -222.79 1.16
##          n.eff
## C            4
## D            4
## K_V          4
## Q         3200
## r_P       2700
## r_V          4
## sigma2_P  6000
## sigma2_V  1600
## deviance    18
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 7.3 and DIC = -224.7
## DIC is an estimate of expected predictive error (lower deviance is better).

[I think in this context, when fitting the deterministic FR model to a stochastic FR simulation, it makes sense to stick to a small noise on the FR, so that readers see how a seemingly small noise can change the results]

Let us now consider potential correlations in the posteriors

### Plot pair posterior densities
postsamples=cbind(out$BUGSoutput$sims.list$r_V,
out$BUGSoutput$sims.list$K_V,
out$BUGSoutput$sims.list$r_P,
out$BUGSoutput$sims.list$Q,
out$BUGSoutput$sims.list$C,
out$BUGSoutput$sims.list$D)

#png(file="PairPosteriorPlot_withFRdata.png", width = 1200, height = 1200,res=300)
pairs(postsamples,c("r_V","K_V","r_P","Q","C","D"))

#dev.off()

postsamples2=cbind(out2$BUGSoutput$sims.list$r_V,
                  out2$BUGSoutput$sims.list$K_V,
                  out2$BUGSoutput$sims.list$r_P,
                  out2$BUGSoutput$sims.list$Q,
                  out2$BUGSoutput$sims.list$C,
                  out2$BUGSoutput$sims.list$D)
#png(file="PairPosteriorPlot_withoutFRdata.png", width = 1200, height = 1200, res=300)
pairs(postsamples2,c("r_V","K_V","r_P","Q","C","D"))

#dev.off()

#pdf(file="PairCorrelPosteriorPlot.pdf",width = 4,height = 8)
par(mfrow=c(2,1))
parcorplot(out,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
parcorplot(out2,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))

#dev.off()

This is one way of visualizing the correlations between parameters. Another idea (that we used as well in the previous paper) is to visualize the consequences of those correlations for the estimation of the functional forms.

We start by reproducing the Estimated FR for the stochastic FR model with and without the correlations between parameters C and D. Removing the correlations is done by permutations on the values of the Markov chain for both parameters.

We now do the same plot for the deterministic FR model (relying only on count time series data).

Clist = out2$BUGSoutput$sims.list$C
Dlist = out2$BUGSoutput$sims.list$D
n = length(Clist)
ndens = 100
Nprey <- seq(1,20,length=ndens) #density index
FRstoch = matrix(NA,nrow = n, ncol = ndens)

#png('Estimated_FR_withoutFRdata.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    FRstoch[i,dens] = Clist[i]*Nprey[dens]/(Dlist[i]+Nprey[dens])
  }
  if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='With (C,D) correlations')
  }
  else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,C*Nprey/(D+Nprey),col=alpha('red',1.0))
}

# what if there was no correlation between parameters? 

Clist = sample(Clist)
Dlist = sample(Dlist)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    FRstoch[i,dens] = Clist[i]*Nprey[dens]/(Dlist[i]+Nprey[dens])
  }
  if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='Without (C,D) correlations')
  }
  else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,C*Nprey/(D+Nprey),col=alpha('red',1.0))
}

#dev.off()

Let us now reproduce the density-dependence curves with and without the correlations between parameters. We do this for the prey; results are similar for the predator.

# For the prey
rlist = out$BUGSoutput$sims.list$r_V
Klist = out$BUGSoutput$sims.list$K_V

n = length(rlist)
ndens = 100
Nprey <- seq(1,50,length=ndens) #density index
preyDD = matrix(NA,nrow = n, ncol = ndens)

#png('Estimated_preyDD.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
  }
  if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='With (r,K) correlations')
  }
  else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}

# what if there was no correlation between parameters? 
rlist = sample(rlist)
Klist = sample(Klist)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
  }
  if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='Without (r,K) correlations')
  }
  else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}

#dev.off()

And now for the deterministic FR model

## without FR data

# For the prey
rlist = out2$BUGSoutput$sims.list$r_V
Klist = out2$BUGSoutput$sims.list$K_V

n = length(rlist)
ndens = 100
Nprey <- seq(1,50,length=ndens) #density index
preyDD = matrix(NA,nrow = n, ncol = ndens)

#png('Estimated_preyDD_withoutFRdata.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
  }
  if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='With (r,K) correlations')
  }
  else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}

# what if there was no correlation between parameters? 
rlist = sample(rlist)
Klist = sample(Klist)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
  }
  if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='Without (r,K) correlations')
  }
  else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}

#dev.off()

We see that in all cases, the deterministic FR model performs much worse (in spite of the small noise). Moreover, the correlations between parameters, which could seem worrying (and signal non-identifiability), actually allow more precise functional forms to be produced for both the functional response and the growth rate curve. Therefore, as long as one is aware of those correlations when looking at the point estimates, they may not be a problem.

Frequentist analysis

Here, we fit the same two models (predator-prey model with stochastic FR vs. with deterministic FR) to data simulated with the stochastic interaction model. The analysis therefore parallels the one performed in the Bayesian setting.

Likelihood functions

We first define the two negative log-likelihood functions (one with FR data, one without) and the two sums of squares that will be used in the following.

################# Define the likelihood ########################################
logLik=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data

#### Parameters
# theta1 = r 
# theta2 = gamma
# theta3 = sigma1
# theta4 = s
# theta5 = q 
# theta6 = sigma2
# theta7 = C
# theta8 = D
# theta9 = sigma3

n=nrow(y)
ll = 0.0 ### one could alternatively initialize with the first-observation density p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2]) 
N_tplus1 = exp(y[t,1]) #useful for the functional response

######## Error that was previously there!! N_t/P_t instead P_t/N_t ###
### mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t
######################################################################
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - y[t-1,3]*P_t/N_t #this is correct timing
mu2 = y[t-1,2] + theta[4] - logprot((1+theta[5]*P_t/N_t))
mu3 = (theta[7]*N_tplus1)/(theta[8] + N_tplus1) #easier to update all variables simultaneously, minimizes errors
#ll= ll + log(dnorm(y[t,1], mu1, theta[3])) +  log(dnorm(y[t,2], mu2, theta[6])) + log(dnorm(y[t,3], mean = mu3, sd = theta[9]))
# we have log(0) problem
d1=dnorm(y[t,1], mu1, theta[3],log=T) ## directly asking for the log avoids problems
d2=dnorm(y[t,2], mu2, theta[6],log=T)
d3=dnorm(y[t,3], mu3, theta[9],log=T)
ll=ll+d1+d2+d3
}
return(-ll)
}


############### Working directly with the sum of squares ######################
RSS=function(theta,y){
  #### y is the data with n rows and 3 columns // log-abundance data + FR data
  
  #### Parameters
  # theta1 = r 
  # theta2 = gamma
  # theta3 = s
  # theta4 = q 
  # theta5 = C
  # theta6 = D
  # not the same theta
  
  n=nrow(y)
  rss = 0.0
  for (t in 2:n){
    N_t = exp(y[t-1,1])
    P_t = exp(y[t-1,2]) 
    N_tplus1 = exp(y[t,1]) #useful for the functional response
    ############## Correction of error ###########################################
    ## mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t # error
    mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - y[t-1,3]*P_t/N_t # corrected
    mu2 = y[t-1,2] + theta[3] - logprot((1+theta[4]*P_t/N_t))
    mu3 = (theta[5]*N_tplus1)/(theta[6] + N_tplus1)
    rss=rss+(y[t,1] - mu1)^2+(y[t,2]-mu2)^2+(y[t,3]-mu3)^2
  }
  return(rss)
}

############################### LL ##################################################

################# Define the likelihood ########################################

logLik_FRwoutNoise=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data

#### Parameters
# theta1 = r 
# theta2 = gamma
# theta3 = sigma1
# theta4 = s
# theta5 = q 
# theta6 = sigma2
# theta7 = C
# theta8 = D
# (no theta9 = sigma3 here: the FR is deterministic, so only 8 parameters)

n=nrow(y)
ll = 0.0 ### one could alternatively initialize with the first-observation density p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2]) 

######## Error that was previously there!! N_t/P_t instead P_t/N_t ###
### mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t
######################################################################
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - (theta[7]*P_t)/(theta[8] + N_t)
mu2 = y[t-1,2] + theta[4] - logprot((1+theta[5]*P_t/N_t))
#ll= ll + log(dnorm(y[t,1], mu1, theta[3])) +  log(dnorm(y[t,2], mu2, theta[6])) + log(dnorm(y[t,3], mean = mu3, sd = theta[9]))
# we have log(0) problem
d1=dnorm(y[t,1], mu1, theta[3],log=T) ## directly asking for the log avoids problems
d2=dnorm(y[t,2], mu2, theta[6],log=T)
#d3=dnorm(y[t,3], mu3, theta[9],log=T)
ll=ll+d1+d2#+d3
}
return(-ll)
}


############### Working directly with the sum of squares ######################
RSS_FRwoutNoise=function(theta,y){
  #### y is the data with n rows and 3 columns // log-abundance data + FR data
  
  #### Parameters
  # theta1 = r 
  # theta2 = gamma
  # theta3 = s
  # theta4 = q 
  # theta5 = C
  # theta6 = D
  # not the same theta
  
  n=nrow(y)
  rss = 0.0
  for (t in 2:n){
    N_t = exp(y[t-1,1])
    P_t = exp(y[t-1,2]) 
    ############## Correction of error ###########################################
    ## mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t # error
    mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) -  (theta[5]*P_t)/(theta[6] + N_t) # corrected
    mu2 = y[t-1,2] + theta[3] - logprot((1+theta[4]*P_t/N_t))
    rss=rss+(y[t,1] - mu1)^2+(y[t,2]-mu2)^2#+(y[t,3]-mu3)^2
  }
  return(rss)
}
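These four functions assume the simulated series `N`, `P` and `FR` produced earlier in the document. For readers running this section standalone, here is a minimal re-simulation sketch: a hypothetical stand-in (not the original simulation code), using parameter values matching `theta_true` below, with `logprot` sketched as a floor-protected log in case it is not already defined.

```r
# Hypothetical re-simulation of the stochastic-FR predator-prey model,
# mirroring the structure of mu1, mu2 and mu3 in logLik above
if (!exists("logprot")) logprot <- function(x) log(pmax(x, .Machine$double.xmin))

simulate_pp <- function(T_len = 100, r_V = 2, gamma = 1, r_P = 0.5, Q = 10,
                        C = 2.5, D = 1, sigma = sqrt(0.05)) {
  logN <- logP <- FR <- numeric(T_len)
  logN[1] <- 0; logP[1] <- 0
  FR[1] <- C * exp(logN[1]) / (D + exp(logN[1]))
  for (t in 2:T_len) {
    N_t <- exp(logN[t - 1]); P_t <- exp(logP[t - 1])
    # prey: growth, Beverton-Holt density dependence, predation via last FR
    logN[t] <- logN[t - 1] + r_V - logprot(1 + gamma * N_t) -
               FR[t - 1] * P_t / N_t + rnorm(1, 0, sigma)
    # predator: Leslie-type numerical response
    logP[t] <- logP[t - 1] + r_P - logprot(1 + Q * P_t / N_t) + rnorm(1, 0, sigma)
    # stochastic functional response around the Michaelis-Menten curve
    FR[t] <- C * exp(logN[t]) / (D + exp(logN[t])) + rnorm(1, 0, sigma)
  }
  cbind(logN, logP, FR)
}

set.seed(42)
data_sim <- simulate_pp()
```

Such a `data_sim` matrix has the same three-column layout (log-prey, log-predator, FR) expected by the likelihood functions above.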

Maximum likelihood and Hessian computation

We then compute the maximum likelihood estimate, starting reasonably far from it.

### We define the data
### Produce dataset to fit
data = cbind(log(N),log(P),FR) 

######### Estimation starting at the MLE ###########
theta_true  = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D,sqrt(0.05))
#theta_init = theta_true + rnorm(9,0,sd=0.05) # very small errors
### Crafting a new theta-init with reasonable a priori errors
theta_init = c(3,0.5,0.75,1,5,0.75,4,5,0.75)

p_opt<-optim(theta_init, logLik, y=data,method="BFGS",hessian=T)
p_opt$par
## [1]  2.0298182  1.0349005  0.2225043  0.5067403 10.3093055  0.2314224
## [7]  2.5146359  1.0202974  0.2194918
theta_true
## [1]  2.0000000  1.0000000  0.2236068  0.5000000 10.0000000  0.2236068
## [7]  2.5000000  1.0000000  0.2236068
hessian=p_opt$hessian
eigen(hessian)$values ## clearly no zero eigenvalues here
## [1] 41477.939825 40362.281312 37310.999154 33665.713773 18679.901299
## [6] 16862.113820    75.108327    12.352505     1.498253
lambda1 = max(eigen(hessian)$values)
9*lambda1*10^(-9) #way below lowest eigenvalue
## [1] 0.0003733015

Looking at the correlations in the matrix

Sigma =solve(hessian) #variance-covariance matrix
Rho = cov2cor(Sigma) #correlation matrix
library(corrplot)
par(mfrow=c(1,1))
rownames(Rho) = c("r_V","1/K_V","sigma1","r_P","Q","sigma2","C","D","sigma3")
colnames(Rho) = c("r_V","1/K_V","sigma1","r_P","Q","sigma2","C","D","sigma3")
corrplot(Rho, method="circle")

corrplot(Rho, method = "number")

# There are significant correlations between pairs of parameters (same as in the Bayesian analysis)

We recover the same correlations as for the Bayesian analysis. Let’s see if we can get rid of the sigma parameters that may make the optimization more difficult while adding little to the estimation of the other parameters:

######## Estimation based on RSS ##################

### New restricted theta
theta_true  = c(rmax_V,1/K,rmax_P,Q,C,D)
#theta_init = theta_true + rnorm(6,0,sd=0.05)
theta_init = c(3,0.5,1,5,4,5)
p_opt<-optim(theta_init, RSS, y=data,hessian=T)
p_opt$par
## [1] 2.9229583 2.8012547 0.4128283 7.9391372 2.5043341 0.9945316
theta_true
## [1]  2.0  1.0  0.5 10.0  2.5  1.0
theta_init
## [1] 3.0 0.5 1.0 5.0 4.0 5.0
eigen(p_opt$hessian)$values ### note the tiny and slightly negative trailing eigenvalues: near-flat directions
## [1] 2221.3231218 2001.4856462 1638.0325684    7.4255103    0.2789953
## [6]   -0.1160630

From this we can get (approximate) confidence intervals [which we will plot in a similar way to the Bayesian a posteriori estimates]. [Do we use approximate Gaussian intervals or something else? Some people use \(\chi^2\)-based intervals on the deviance scale.]
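A minimal sketch of such Wald-type intervals from the inverse Hessian of the negative log-likelihood, illustrated on a hypothetical toy Gaussian model (the same recipe applies to `p_opt$par` and `p_opt$hessian` above, since we minimize -LL and the inverse Hessian then estimates the variance-covariance matrix):

```r
# Toy illustration (hypothetical data and model, not the predator-prey fit)
set.seed(1)
x_toy <- rnorm(200, mean = 2, sd = 0.5)
# parameterize the sd on the log scale to keep it positive
nll_toy <- function(theta) -sum(dnorm(x_toy, theta[1], exp(theta[2]), log = TRUE))
fit_toy <- optim(c(0, 0), nll_toy, method = "BFGS", hessian = TRUE)
se <- sqrt(diag(solve(fit_toy$hessian)))  # inverse Hessian of -LL = asymptotic vcov
wald_ci <- cbind(estimate = fit_toy$par,
                 lower = fit_toy$par - 1.96 * se,
                 upper = fit_toy$par + 1.96 * se)
wald_ci
```

Profile- or deviance-based intervals would be preferable when the likelihood surface is strongly non-quadratic, as suggested by the ridges seen below.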

###################################################################################################
################### Now for the model where the FR is specified without the noise #################
###################################################################################################
# Beware: theta_true has only 8 elements here, no sigma3 (deterministic FR)

theta_true  = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D)
### Crafting a new theta-init with reasonable a priori errors
theta_init = c(3,0.5,0.75,1,5,0.75,4,5)

p_opt1<-optim(theta_init, logLik_FRwoutNoise,method="BFGS", y=data,hessian=T)
p_opt1$par
## [1]  1.5718984  0.5730893  0.2226600  0.5067040 10.3083510  0.2314212
## [7] 26.0655759 53.0488133
theta_true
## [1]  2.0000000  1.0000000  0.2236068  0.5000000 10.0000000  0.2236068
## [7]  2.5000000  1.0000000
theta_init
## [1] 3.00 0.50 0.75 1.00 5.00 0.75 4.00 5.00
eigen(p_opt1$hessian)$values ### BFGS
## [1] 5.505953e+04 4.030562e+04 3.731199e+04 1.868010e+04 4.707488e+01
## [6] 1.498589e+00 8.201987e-02 1.141953e-04
lambda1 = max(eigen(p_opt1$hessian)$values)
8*lambda1*10^(-9) ### the smallest eigenvalue falls below this threshold -- Viallefont et al.'s criterion for singularity
## [1] 0.0004404762
diag(solve(p_opt1$hessian)) ## positive values though
## [1] 2.021508e-02 1.120156e-02 2.481616e-05 1.005341e-03 6.663426e-01
## [6] 2.680104e-05 1.476976e+03 7.292133e+03
### Looks like BFGS performs worse than Nelder-Mead without the FR data (see below)

p_opt2<-optim(theta_init, logLik_FRwoutNoise, y=data,hessian=T)
p_opt2$par
## [1] 2.1948736 1.2279028 0.2206843 0.3550715 6.5116240 0.2309290 4.5669852
## [8] 5.1584874
eigen(p_opt2$hessian)$values ### Nelder-Mead by default
## [1] 42784.648389 39426.476918 30604.638451 18749.117486    46.054877
## [6]     3.689130     2.639528    -0.111515
## Beware that we use -LL, so Hessian = observed FIM (Fisher Information Matrix)
## Should we loop over datasets to obtain the true, expected FIM?
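The last question can be sketched generically: averaging the observed information (Hessian of -LL evaluated at the true parameters) over replicate simulated datasets approximates the expected FIM. A hypothetical toy Gaussian illustration, assuming that for our model the same loop would wrap the predator-prey simulation and `logLik` above:

```r
# Monte Carlo approximation of the expected Fisher Information Matrix
set.seed(2)
theta0 <- c(1, log(0.5))  # true (mean, log-sd) of the toy model
nll_rep <- function(theta, x) -sum(dnorm(x, theta[1], exp(theta[2]), log = TRUE))
R <- 200      # number of replicate datasets
n_obs <- 50   # observations per dataset
H_sum <- matrix(0, 2, 2)
for (b in 1:R) {
  x <- rnorm(n_obs, theta0[1], exp(theta0[2]))
  # numerical Hessian of -LL at the truth = observed information
  H_sum <- H_sum + optimHess(theta0, nll_rep, x = x)
}
FIM_expected <- H_sum / R  # theory here: diag = (n/sigma^2, 2n) = (200, 100)
```
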

Likelihood surfaces

We plot these for two parameters at a time, based on the pairs of correlated parameters identified in the preceding section.

theta_true  = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D,sqrt(0.05))

par(mfrow=c(1,1))
# basic informal check for C and D
logLik(theta_true,data)
##           
## -223.5108
### Let's make that zoom precise 
DeltaC = 0.25 # C is +/- 2
DeltaD = 0.5 #D is +/-0.9
niter = 50
C_new=D_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
for (i in 1:niter){
  for (j in 1:niter){
    C_new[i] =2*DeltaC*i/niter-DeltaC
    D_new[j] = 2*DeltaD*j/niter-DeltaD
    theta_new=theta_true+c(0,0,0,0,0,0,C_new[i],D_new[j] ,0)
    llbis[i,j]=logLik(theta_new,data)
  }
}
#hist(llbis) ## what values are in there
min(llbis)
## [1] -223.7919
custom_levels=c(500,100,0,-50,-200,-210,-215,-220,-225)
#contour(theta_true[7]+C_new,theta_true[8]+D_new,llbis,nlevels=20,xlab="C",ylab="D")
contour(theta_true[7]+C_new,theta_true[8]+D_new,llbis,levels=custom_levels,xlab="C",ylab="D")

### Try to visualise this using a 3D surface plot
library(rgl)
interval = (niter/2-10):(niter/2+10)
custom_levels=c(1000,500,100,50,0,-50,-100,-150,-200,-210,-215,-220)
persp3d(theta_true[7]+C_new[interval],theta_true[8]+D_new[interval],-llbis[interval,interval], col="skyblue")
### Maximum not visible on the surface. 

The maximum is not extremely well defined, but it does exist.

################# Now with (r,gamma=1/K)
logLik(theta_true,data)
##           
## -223.5108
niter = 50
r_new=g_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
Deltar = 1.5 # rmax_V is +/- 1.5
Deltag = 0.9 #1/K is +/- 0.9
for (i in 1:niter){
  for (j in 1:niter){
    r_new[i] =2*Deltar*i/niter-Deltar
    g_new[j] = 2*Deltag*j/niter-Deltag
    theta_new=theta_true+c(r_new[i],g_new[j],0,0,0,0,0,0,0) #9 elements, matching theta_true
    llbis[i,j]=logLik(theta_new,data)
  }
}
#contour(theta_true[1]+r_new,theta_true[2]+g_new,llbis,nlevels=50,xlab="r",ylab="1/K")
#changing the representation of the levels
hist(llbis) ## what values are in there

custom_levels=c(80000,60000,40000,20000,5000,1000,500,100,50,0,-10,-20,-30)
contour(theta_true[1]+r_new,theta_true[2]+g_new,llbis,levels=custom_levels,xlab="r",ylab="1/K")

Same here, there is a maximum even though it is along a ridge.

########### Other profiles #####################################
# We have (r_V, 1/K) and (C,D) - we also need (r_P,Q)
################# Now with (r_P, Q)

niter = 50
rP_new=q_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
Deltar = 0.3 # rmax_P is +/- 0.3
Deltaq = 2 # Q is +/- 2
for (i in 1:niter){
  for (j in 1:niter){
    rP_new[i] =2*Deltar*i/niter-Deltar
    q_new[j] = 2*Deltaq*j/niter-Deltaq
    theta_new=theta_true+c(0,0,0,rP_new[i],q_new[j],0,0,0,0)
    llbis[i,j]=logLik(theta_new,data)
  }
}
contour(theta_true[4]+rP_new,theta_true[5]+q_new,llbis,nlevels=50,xlab="rP",ylab="Q")

#changing the representation of the levels
hist(llbis) ## what values are in there

min(llbis)
## [1] -223.8116
custom_levels=quantile(llbis,probs=c(0.025,0.05,0.075,0.1,0.25,0.5,0.75),na.rm=T)
contour(theta_true[4]+rP_new,theta_true[5]+q_new,llbis,levels=custom_levels,xlab="rP",ylab="Q")

We can find similar results by working directly with the sum of squares.

(I have previously called these 2D plots likelihood profiles but may have been wrong about this, since “profiles” refer to ridgelines, cf. Bolker EMDB chap. 6, p. 186. A likelihood profile is computed by fixing one parameter, say r, then optimizing -LL over all other parameters, and finally plotting the optimized -LL against r. If it is flat, then this parameter is not identifiable. There is some debate in identifiability papers (cf. Raue et al. 2009; Kao & Eisenberg 2018) about how useful these profiles are.)

As Bolker does (p. 190, Fig. 6.8), we can plot both the likelihood “slice” and the likelihood profile on the same graph. [Incidentally, he does that for the attack rate of a functional response!]
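The profile-vs-slice comparison can be sketched on a hypothetical toy Gaussian model; for our model one would fix, e.g., r = theta[1] and re-optimize `logLik` over the remaining eight parameters at each grid value.

```r
# Toy illustration: likelihood slice (nuisance fixed at MLE) vs.
# likelihood profile (nuisance re-optimized at each grid value)
set.seed(3)
x_prof <- rnorm(100, 1, 0.5)
nll_prof <- function(theta) -sum(dnorm(x_prof, theta[1], exp(theta[2]), log = TRUE))
fit_prof <- optim(c(0, 0), nll_prof, method = "BFGS")

mu_grid <- seq(fit_prof$par[1] - 0.3, fit_prof$par[1] + 0.3, length.out = 41)
ll_slice <- sapply(mu_grid, function(m) nll_prof(c(m, fit_prof$par[2])))
ll_profile <- sapply(mu_grid, function(m)
  optimize(function(ls) nll_prof(c(m, ls)), interval = c(-5, 5))$objective)

plot(mu_grid, ll_slice, type = 'l', xlab = "mu", ylab = "-LL")
lines(mu_grid, ll_profile, lty = 2)  # the profile lies at or below the slice
```

A flat profile over the grid would indicate practical non-identifiability of the focal parameter.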

Bayesian analysis II - reparameterization of the model

So far, the model was parameterized as is often done in the literature (e.g. Weide et al.), but not exactly in terms of a carrying capacity for the prey (the carrying capacity can be found by computing the equilibrium population size of the Beverton-Holt model, and is not equal to K in the previous formulation). Here we attempt a reparameterization to decrease the correlation between pairs of parameters belonging to the same function in the model. In this new formulation, we have \(N^*=K\) for the prey considered alone and \(P^* = N/q\) for the predator alone when the prey is fixed. We also consider a classic \((a,h)\) parameterization of the functional response.

\[ \begin{aligned} N_{t+1} & = N_{t}\frac{e^{r+\epsilon_{1t}}}{1+(e^r-1) N_{t}/K}\exp\left(-G(N_{t})\frac{P_{t}}{N_{t}}\right),\,\epsilon_{1t}\sim\mathcal{N}(0,\sigma_{1}^{2})\label{eq:prey_discreteLeslieMay_reparam} \end{aligned} \]

and

\[ \begin{aligned} P_{t+1} & = P_{t}\frac{e^{s+\epsilon_{2t}}}{1+(e^s-1)qP_{t}/N_{t}},\,\epsilon_{2t}\sim\mathcal{N}(0,\sigma_{2}^{2})\label{eq:predator_discreteLeslieMay_reparam} \end{aligned} \]
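As a sanity check on this formulation, a minimal simulation sketch (illustrative parameter values, not the fitted ones, suffixed with 0 to avoid clobbering objects used elsewhere; a Holling type-II \(G(N)=aN/(1+ahN)\) is assumed). In the deterministic, predator-free case, the prey now settles exactly at \(N^*=K\), as intended:

```r
set.seed(4)
# illustrative (hypothetical) parameter values
r0 <- 2; K0 <- 10; s0 <- 0.5; q0 <- 10; a0 <- 2; h0 <- 0.4
sigma1 <- sigma2 <- sqrt(0.05)
G0 <- function(N) a0 * N / (1 + a0 * h0 * N)  # Holling type-II FR

step <- function(N, P, eps1 = 0, eps2 = 0) {
  N1 <- N * exp(r0 + eps1) / (1 + (exp(r0) - 1) * N / K0) * exp(-G0(N) * P / N)
  P1 <- P * exp(s0 + eps2) / (1 + (exp(s0) - 1) * q0 * P / N)
  c(N1, P1)
}

# deterministic, predator-free check: the prey converges to N* = K0
Ndet <- 1
for (t in 1:100) Ndet <- Ndet * exp(r0) / (1 + (exp(r0) - 1) * Ndet / K0)

# one stochastic trajectory with predators
T_len <- 200
traj <- matrix(NA, T_len, 2); traj[1, ] <- c(1, 0.1)
for (t in 2:T_len) traj[t, ] <- step(traj[t - 1, 1], traj[t - 1, 2],
                                     rnorm(1, 0, sigma1), rnorm(1, 0, sigma2))
```
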

Correlation between parameters with re-parameterization

Let us first produce a few diagnostics, starting with posterior densities

For the stochastic FR model

denplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))

For the deterministic FR model

denplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))

Now Trace plots

#png(file="TracePlot_withFRdata_reparam.png", width = 1200, height = 1200,res=300)
traplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))

#dev.off()
#png(file="TracePlot_withoutFRdata_reparam.png", width = 1200, height = 1200,res=300)
traplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))

#dev.off()

We now examine correlations for the stochastic FR model

### Plot pair posterior densities
postsamples=cbind(out$BUGSoutput$sims.list$r_V,
out$BUGSoutput$sims.list$K_V,
out$BUGSoutput$sims.list$r_P,
out$BUGSoutput$sims.list$Q,
out$BUGSoutput$sims.list$a,
out$BUGSoutput$sims.list$h)
#png(file="PairPosteriorPlot_withFRdata_reparam.png", width = 1200, height = 1200,res=300)
pairs(postsamples,c("r_V","K_V","r_P","Q","a","h"))

#dev.off()

and for the deterministic FR model

postsamples2=cbind(out2$BUGSoutput$sims.list$r_V,
                  out2$BUGSoutput$sims.list$K_V,
                  out2$BUGSoutput$sims.list$r_P,
                  out2$BUGSoutput$sims.list$Q,
                  out2$BUGSoutput$sims.list$a,
                  out2$BUGSoutput$sims.list$h)
#png(file="PairPosteriorPlot_withoutFRdata_reparam.png", width = 1200, height = 1200, res=300)
pairs(postsamples2,c("r_V","K_V","r_P","Q","a","h"))

#dev.off()

Let us consider other ways of plotting

#pdf(file="PairCorrelPosteriorPlot_reparam.pdf",width = 4,height = 8)
par(mfrow=c(2,1))
parcorplot(out,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
parcorplot(out2,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))

#dev.off()

We now reproduce the FR curve with and without the correlations between parameters. The red line is again the true functional response.

For the stochastic FR model

and for the model without the FR data

alist = out2$BUGSoutput$sims.list$a
hlist = out2$BUGSoutput$sims.list$h
n = length(alist)
ndens = 100
Nprey <- seq(1,20,length=ndens) #density index
FRstoch = matrix(NA,nrow = n, ncol = ndens)

#png('Estimated_FR_withoutFRdata_reparam.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    FRstoch[i,dens] = alist[i]*Nprey[dens]/(1.0+alist[i]*hlist[i]*Nprey[dens])
  }
  if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='With (a,h) correlations')
  }
  else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,a*Nprey/(1+a*h*Nprey),col=alpha('red',1.0))
}

# what if there was no correlation between parameters? 

alist = sample(alist)
hlist = sample(hlist)
for (i in 1:n){
  for (dens in 1:length(Nprey))
  {
    FRstoch[i,dens] = alist[i]*Nprey[dens]/(1.0+alist[i]*hlist[i]*Nprey[dens])
  }
  if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='Without (a,h) correlations')
  }
  else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
  lines(Nprey,a*Nprey/(1+a*h*Nprey),col=alpha('red',1.0))
}

#dev.off()

Now we do this for the density-dependent curves, with and without the correlations between parameters

and without FR data

We see that the reparameterization might have helped make these curves a little more precise, though not overly so. The parameter values, and hence the model behaviour, are rigorously identical to those of the first parameterization.

Discussion

Identifiability

It seems that the model is globally identifiable when FR data are present. However, for the first parameterization, pairs of parameters belonging to the same functional forms of the model are highly correlated. These pairs are, respectively, (r, K), (s, Q), and (C, D). Plots of the functions realized for each parameter pair show that such correlations are substantial for T=1000.

Reparameterizing so that K and Q are more closely related to carrying capacities substantially decreased parameter correlations [I would not go as far as saying it improved identifiability, since the models seemed to be identifiable in the first place]. Whilst Bolker (EMDB book, p. ) suggested that a Michaelis-Menten formulation (\(CN/(D+N)\)) of the FR may improve identifiability compared to the classical \((a,h)\) formulation (which was our guess as well), we did not find this here: correlation seems slightly lower for the \((a,h)\) pair.

[Another issue that we may want to convey here: Bolker mentions in his book, p. 200, that “all other things being equal, smaller confidence regions (i.e., for larger and less noisy datasets and for higher alpha levels) are more elliptical”. This would mean that the less noise there is and the longer the time series, the more correlation we get between the parameters. Could that be true here? Check T=100 and less.]

The absence of FR data substantially compromises identifiability. This is all the more important given that we consider here fairly small noise on the functional response and relatively rich datasets. [more here based on what we find – do profile likelihoods for these?]

(perhaps discussion of the work by Cole et al. 2010 – exhaustive summaries are difficult to obtain here so we cannot do symbolic work. Discuss also the subset profiling of Eisenberg & Hayashi (2014))

Maximum likelihood algorithms

[BFGS vs Nelder-Mead]

[Try likelihood surfaces and profiles for the case without FR data??]

Stochastic or deterministic FR in absence of FR data?

[try model with noise on the FR even without FR data? I think I tried this once and it did not work very well, but on the other hand we had some bugs in the code]

Future avenues for modelling development

  • Inserting demographic information together with functional response data.
  • [do I mention more complex FR here or in a Supplement + before]
  • Different functional responses, e.g. Ivlev-type \(G(N) = C(1-\exp(-\ln(2)\frac{N}{D}))\). Two things may be interesting:
    • Simulating with Ivlev and fitting with Ivlev
    • Simulating with Holling and fitting with Ivlev, though beware that (i) we won’t be able to match the parameters exactly because the shapes are different, and (ii) small differences in FR shape can induce large differences in model behaviour sensu Fussmann & Blasius (2005).
  • Predator-dependent or multiple-species functional responses – later. All the more important since predator-dependence can emerge from the discreteness of time.
  • Difficulties in measuring the functional response and taking prey replacement into account [Juliano-type models etc.]. Discuss perhaps chapter 6 of Bolker (2008), where he does this on count data – does he take depletion into account in the original article? If not, why not?
  • Other kinds of interactions models. Perhaps cases with more interaction data and less time series (mutualisms?).

Appendices

Stability analysis of the deterministic model and stochastic implications

Noisy limit cycle parameter set

Supplements (online)

Estimation and identifiability of a Rosenzweig-MacArthur type of model

More stochastic functional responses

Large noise cases, added (temporal) variability in parameters \(C\) or \(D\)

References

Barraquand, F. & Nielsen, Ò.K. (2018). Predator-prey feedback in a gyrfalcon-ptarmigan system? Ecology and Evolution, 8, 12425–12434.

Besbeas, P., Freeman, S.N., Morgan, B.J. & Catchpole, E.A. (2002). Integrating mark–recapture–recovery and census data to estimate animal abundance and demographic parameters. Biometrics, 58, 540–547.

Bolker, B. (2008). Ecological models and data in r. Princeton University Press.

Cole, D.J., Morgan, B.J. & Titterington, D. (2010). Determining the parametric structure of models. Mathematical biosciences, 228, 16–30.

DeLong, J.P., Hanley, T.C., Gibert, J.P., Puth, L.M. & Post, D.M. (2018). Life history traits and functional processes generate multiple pathways to ecological stability. Ecology, 99, 5–12.

Eisenberg, M.C. & Hayashi, M.A. (2014). Determining identifiable parameter combinations using subset profiling. Mathematical biosciences, 256, 116–126.

Ferguson, J.M., Hopkins III, J.B. & Witteveen, B.H. (2018). Integrating abundance and diet data to improve inferences of food web dynamics. Methods in Ecology and Evolution, 9, 1581–1591.

Fussmann, G. & Blasius, B. (2005). Community response to enrichment is highly sensitive to model structure. Biology Letters, 1, 9–12.

Gilioli, G., Pasquali, S. & Ruggeri, F. (2008). Bayesian inference for functional response in a stochastic predator–prey system. Bulletin of Mathematical Biology, 70, 358–381.

Gilioli, G., Pasquali, S. & Ruggeri, F. (2012). Nonlinear functional response parameter estimation in a stochastic predator-prey model. Math. Biosci. Eng, 9, 75–96.

Jost, C. & Arditi, R. (2001). From pattern to process: Identifying predator–prey models from time-series data. Population Ecology, 43, 229–243.

Kao, Y.-H. & Eisenberg, M.C. (2018). Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment. Epidemics, 25, 89–100.

Leslie, P. (1948). Some further notes on the use of matrices in population mathematics. Biometrika, 35, 213–245.

Leslie, P. & Gower, J. (1960). The properties of a stochastic model for the predator-prey type of interaction between two species. Biometrika, 47, 219–234.

Maunder, M.N. (2004). Population viability analysis based on combining bayesian, integrated, and hierarchical analyses. Acta Oecologica, 26, 85–94.

May, R. (1973). Stability and complexity in model ecosystems. Princeton University Press, Princeton, USA.

Niu, S., Luo, Y., Dietze, M.C., Keenan, T.F., Shi, Z., Li, J. et al. (2014). The role of data assimilation in predictive ecology. Ecosphere, 5, art65.

Péron, G. & Koons, D.N. (2012). Integrated modeling of communities: Parasitism, competition, and demographic synchrony in sympatric ducks. Ecology, 93, 2456–2464.

Planque, B., Lindstrøm, U. & Subbey, S. (2014). Non-deterministic modelling of food-web dynamics. PloS one, 9, e108243.

Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M., Klingmüller, U. et al. (2009). Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 25, 1923–1929.

Rosenbaum, B. & Rall, B.C. (2018). Fitting functional responses: Direct parameter estimation by simulating differential equations. Methods in Ecology and Evolution, 9, 2076–2090.

Rosenbaum, B., Raatz, M., Weithoff, G., Fussmann, G.F. & Gaedke, U. (2019). Estimating parameters from multiple time series of population dynamics using bayesian inference. Frontiers in Ecology and Evolution, 6, 234.

Tanner, J. (1975). The stability and the intrinsic growth rates of prey and predator populations. Ecology, 56, 855–867.

Turchin, P. & Ellner, S. (2000). Living on the edge of chaos: population dynamics of Fennoscandian voles. Ecology, 81, 3099–3116.

Turchin, P. & Hanski, I. (1997). An empirically based model for latitudinal gradient in vole population dynamics. American Naturalist, 149, 842–874.

Weide, V., Varriale, M.C. & Hilker, F.M. (2018). Hydra effect and paradox of enrichment in discrete-time predator-prey models. Mathematical Biosciences.